Abstract: Medial temporal lobe (MTL) dependent long-term memory for novel events is modulated by
a circuitry that also responds to reward and includes the ventral striatum, dopaminergic midbrain, 
and medial orbitofrontal cortex (mOFC). This common neural network may reflect a functional link 
between novelty and reward whereby novelty motivates exploration in the search for rewards; a link 
also termed novelty "exploration bonus." We used fMRI in a scene encoding paradigm to investigate 
the interaction between novelty and reward with a focus on neural signals akin to an exploration bo- 
nus. As expected, reward related long-term memory for the scenes (after 24 hours) strongly correlated 
with activity of MTL, ventral striatum, and substantia nigra/ventral tegmental area (SN/VTA). Fur- 
thermore, the hippocampus showed a main effect of novelty, the striatum showed a main effect of 
reward, and the mOFC signalled both novelty and reward. An interaction between novelty and reward 
akin to an exploration bonus was found in the hippocampus. These data suggest that MTL novelty sig- 
nals are interpreted in terms of their reward-predicting properties in the mOFC, which biases striatal 
reward responses. The striatum together with the SN/VTA then regulates MTL-dependent long-term 
memory formation and contextual exploration bonus signals in the hippocampus. Hum Brain Mapp
33:1309-1324, 2012. © 2011 Wiley Periodicals, Inc.
INTRODUCTION

Novelty is a motivationally salient learning signal that 
attracts attention, promotes memory encoding and modi- 
fies goal-directed behavior [Knight, 1996; Lisman and 
Grace, 2005; Mesulam, 1998; Sokolov, 1963]. Recent evi- 
dence from human and nonhuman primate studies raises 
the possibility that the motivational aspects of novelty 
partly relate to its shared properties with reward [Bunzeck 
and Duzel, 2006; Kakade and Dayan, 2002; Mesulam, 
1998]. This suggestion follows from observations that in 
animal studies the substantia nigra/ventral tegmental area
(SN/VTA) of the midbrain is activated by stimuli that pre- 
dict rewards as well as stimuli that are novel [Ljungberg, 
et al. 1992]; for a review see [Lisman and Grace, 2005]. 
Similarly, the human SN/VTA is activated both by reward 
[Knutson and Cooper, 2005] and novelty [Bunzeck and 
Duzel, 2006; Bunzeck, et al. 2007; Wittmann, et al. 2005] as 
well as by cues predicting their occurrence [Knutson and 
Cooper, 2005; O'Doherty, et al. 2002; Wittmann, et al. 2005, 
2007]. The neurotransmitter dopamine that is produced in 
the SN/VTA profoundly regulates motivational aspects of 
behavior [Berridge, 2007; Niv, et al. 2007]. 

Furthermore, there is converging evidence that the hip- 
pocampus, a medial temporal lobe (MTL) structure, which 
is critical for the formation of long-term episodic memories 
for novel events, is also implicated in various forms of 
reward learning [Devenport, et al. 1981; Holscher, et al. 
2003; Ploghaus, et al. 2000; Purves, et al. 1995; Rolls and 
Xiang, 2005; Solomon, et al. 1986; Tabuchi, et al. 2000; 
Weiner, 2003; Wirth, et al. 2009]. For instance, the rodent 
hippocampus shows increased activity in baited but not 
unbaited maze arms [Holscher, et al. 2003]; in nonhuman 
primates it is involved in learning place reward associa- 
tions [Rolls and Xiang, 2005]; hippocampal activity follows 
prediction error learning rules for aversive stimuli in 
humans [Ploghaus, et al. 2000]; and reward increases syn- 
chronization between hippocampus and nucleus accum- 
bens neurons [Tabuchi, et al. 2000]. 

A commonality in the effects of reward and novelty can 
be reconciled theoretically by a suggestion that novelty 
acts to motivate exploration of an environment to harvest 
rewards [Kakade and Dayan, 2002]. According to this sug- 
gestion, a key motivational property of novelty is its 
potential to predict rewards, whereas familiar stimuli, if 
repeated in the absence of reward, gradually lose this
potential. The exploration bonus hypothesis makes two 
types of predictions: a first one relates to the potency with 
which the status of being novel or familiar can predict 
reward and a second one relates to the contextually 
remote effects of this contingency on other stimuli. 
According to the first prediction, being a novel stimulus 
should be a more potent predictor of reward than being a 
familiar stimulus [e.g., Wittmann, et al. 2008]. That is, 
when novel stimuli predict reward, reward expectancy 
should be higher than when familiar stimuli predict 
rewards. The second (more indirect) prediction is that the
motivationally enhancing effect of novelty on exploratory
behavior should have a contextual effect on the motiva- 
tional significance of other stimuli that are present in the 
same context. Compatible with this suggestion, Bunzeck 
and Duzel [2006] showed that in a context in which novel 
stimuli are present, familiar stimuli show less repetition 
suppression in MTL structures. This suggests that even in 
the absence of explicit reward, in a context in which novel 
stimuli are present, there is a stronger motivation to 
explore also the familiar stimuli in that context [Bunzeck 
and Duzel, 2006]. However, to date, these predictions 
about the relationship between novelty and reward have 
not been tested directly. In experimental terms, this 
requires manipulating the reward-predicting property of 
novelty such that rewards in a given context are predicted 
either by being novel or by being familiar. Here, we used 
this experimental approach to investigate the functional 
interaction between novelty and reward in an fMRI study. 

Understanding the functional interaction between nov- 
elty and reward has profound implications for under- 
standing how long-term plasticity for novel stimuli is 
regulated. A large body of physiological evidence shows 
that dopamine originating from the SN/VTA not only reg- 
ulates motivational aspects of behavior but is critical for 
enhancing and stabilizing hippocampal plasticity [Frey 
and Morris, 1998; Li, et al. 2003] and hippocampus-de- 
pendent memory consolidation [O'Carroll, et al. 2006]. 
According to the so-called hippocampus-VTA loop model
[Lisman and Grace, 2005] novelty signals are generated in 
the hippocampus and are conveyed to the SN/VTA 
through the nucleus accumbens and the ventral pallidum 
[Lisman and Grace, 2005]. Although the model emphasizes 
novelty itself as the key cognitive signal to modulate dopa- 
mine from the SN/VTA, it also explicitly raises the ques- 
tion how motivational factors regulate the impact of 
novelty on the activity of the hippocampus and the SN/ 
VTA. The goal of this study is to approach this question 
from the vantage point of shared properties between nov- 
elty and reward and their functional interaction. 

If novelty acts as a signal that motivates exploration to 
harvest rewards [Bunzeck and Duzel, 2006; Kakade and 
Dayan, 2002; Wittmann, et al. 2008] parts of the hippocam- 
pus-SN/VTA loop should only show a preferential 
response to novelty in a context where being novel pre- 
dicts rewards but not in a context where being familiar 
predicts reward. At the same time, the enhancement of ex- 
ploration when being novel is rewarded should boost hip- 
pocampal responses to familiar stimuli that are presented 
in the same context, even though these would not predict 
rewards. In contrast, in a context in which being familiar 
but not being novel predicts rewards, there should be less 
contextual motivation to explore and consequently hippo- 
campal activity should be low for both the novel and the 
familiar stimuli in that context. Hence, the hypothesis that 
novelty has an intrinsic property to motivate explorative 
behavior in the search for rewards leads to the prediction 
of an interaction between the novelty- and reward status 
of stimuli. Accordingly, the hippocampus would respond
strongly to both novel and familiar stimuli when being 
novel predicts reward and weakly to both novel and fa- 
miliar stimuli when being familiar predicts reward. 

The alternative possibility is that the novelty status and
reward status of information are independent. According to
this possibility, there should be no functional interaction 
between novelty and reward. In other words, parts of the 
hippocampus-SN/VTA loop would only express a main 
effect of novelty or reward but no interaction between 
both. 

Taken together, manipulating the contingency between 
novelty and rewards can help to understand the key 
mechanisms that drive novelty responses within the meso- 
limbic system. To that end, we developed a paradigm 
where receiving monetary reward was contingent upon 
the novelty status of images of scenes [Bunzeck, et al. 
2009]. Thus, making correct reward preference decisions 
(see methods) was only possible after correctly discrimi- 
nating novel and familiar stimuli. Importantly, we 
assessed recognition memory one day after encoding and 
thus were able to identify to what extent components of 
the hippocampal-SN/VTA loop would correlate with the 
reward-related enhancement of long-term memory for 
novel and familiar stimuli. 

MATERIALS AND METHODS 

Two experiments were performed. Experiment 1 was purely
behavioral, whereas Experiment 2 combined behavioral
measures with fMRI.

Subjects 

In Experiment 1, 17 adults participated (13 female and 
four male; age range 19-33 years; mean 23.1, SD = 4.73 
years) and 14 adults participated in Experiment 2 (five 
male and nine female; age range: 19-34 years; mean = 22.4 
years; SD = 3.8 years). All subjects were healthy, right- 
handed and had normal or corrected-to-normal acuity. 
None of the participants reported a history of neurological, 
psychiatric, or medical disorders or any current medical 
problems. All experiments were run with each subject's 
written informed consent and according to the local ethics 
clearance (University College London, UK). 

Experimental Design and Task 

In both experiments, subjects completed three sets, each
consisting of (1) a familiarization phase followed by (2) a
recognition memory based preference judgment task. New
images were used for each set, resulting in 120 novel and
120 familiar images being used altogether. The experimental
procedures were identical for both experiments except that
Experiment 1 was performed on a computer screen and
Experiment 2 was performed inside an MRI scanner. (3)
On day two, recognition memory for all presented images
was tested using the "remember/know" procedure (see
below).

(1) Familiarization: Subjects were initially familiarized
with a set of 40 images (20 indoor and 20 outdoor images).
Each picture was presented twice in random order
for 1.5 s with an interstimulus interval (ISI) of 3 s, and
subjects indicated the indoor/outdoor status using their right
index and middle fingers. (2) Recognition memory
test: subsequently, subjects performed a 9-minute recognition
memory based preference judgment task (session).
This session was further subdivided into two blocks,
each containing 20 images from the familiarization phase
(referred to as "familiar images") and 20 previously not
presented images (referred to as "novel images"; subjects
could pause for 20 s between blocks). In any given block
either novel images served as CS+ and familiar images as
CS- or vice versa (Fig. 1). Participants were instructed to
make a "preference" judgment to each image via a two- 
choice button press indicating "I prefer" or "I do not pre- 
fer" depending on the contingency between novelty status 
and reinforcement value. Importantly, the term "pre- 
ferred" and "not-preferred" refers to the reward predicting 
status of the image (depending on the contextual contin- 
gency) rather than the aesthetic properties of the picture. 

The contingency was randomized and indicated on the
screen prior to each run by either "Novelty will be
rewarded if preferred" (in which case novel images served
as CS+ and familiar images as CS-) or "Familiarity will
be rewarded if preferred" (here familiar images served as
CS+ and novel images as CS-). Only correct "I prefer"
responses following a CS+ led to a win of £0.50, whereas
(incorrect) "I prefer" responses following a CS- led to a loss
of £0.10. Both correct "I do not prefer" responses following
a CS- and (incorrect) "I do not prefer" responses following
a CS+ led to neither win nor loss. Images were
presented in random order for 1 s on a gray background 
followed by a white fixation cross for 2 s (ISI = 3 s). To 
ensure that neural reward responses were limited to the 
presented images (i.e., reward anticipation rather than out- 
come) no feedback was given on a trial by trial basis. 
Instead subjects were informed about their overall per- 
formance after each session (containing 2 blocks with each 
contingency). Prior to the experiment the subjects were 
instructed to respond as quickly and as correctly as possi- 
ble and that only 20% of all earnings would be paid. 
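
The payoff rule described above can be written out as a short sketch. The amounts (£0.50 win, £0.10 loss, 20% payout) come from the text; the function names, the pence-based bookkeeping, and the example block are illustrative assumptions and not part of the original task software.

```python
WIN_PENCE = 50     # correct "I prefer" after a CS+ (from the text)
LOSS_PENCE = 10    # incorrect "I prefer" after a CS- (from the text)
PAID_PERCENT = 20  # share of all earnings actually paid (from the text)


def trial_outcome(is_novel: bool, novelty_rewarded: bool, prefers: bool) -> int:
    """Outcome of one trial in pence (hypothetical helper)."""
    # An image is a CS+ when its novelty status matches the rewarded status
    # announced for the current block.
    is_cs_plus = is_novel == novelty_rewarded
    if not prefers:
        return 0  # "I do not prefer" leads to neither win nor loss
    return WIN_PENCE if is_cs_plus else -LOSS_PENCE


def paid_pence(trials) -> float:
    """Sum outcomes over (is_novel, novelty_rewarded, prefers) tuples
    and apply the payout share."""
    total = sum(trial_outcome(*trial) for trial in trials)
    return total * PAID_PERCENT / 100


# Assumed example: a flawless block in the "novelty rewarded" context,
# with 20 novel (CS+) and 20 familiar (CS-) images.
perfect_block = [(True, True, True)] * 20 + [(False, True, False)] * 20
```

Under these assumptions a flawless block earns 20 x 50 = 1000 pence, of which 200 pence would be paid. Because no trial-by-trial feedback was given, such a tally corresponds to the summary subjects saw only after each session.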

All images were gray-scaled and normalized to a mean 
gray-value of 127 and a standard deviation of 75. None of 
the scenes depicted human beings or parts of human 
beings including faces in the foreground. 
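
One way such a gray-value normalization could be implemented is sketched below. The targets (mean 127, standard deviation 75) are from the text; the use of the population standard deviation and the absence of any clipping to the 0-255 range are assumptions, since the source does not describe how out-of-range values were handled.

```python
from statistics import fmean, pstdev

TARGET_MEAN = 127.0  # from the text
TARGET_SD = 75.0     # from the text


def normalize_gray(pixels):
    """Linearly rescale a flat list of gray values to the target mean and SD.

    Hypothetical helper: values are NOT clipped to 0-255 here, because
    clipping would shift the mean and SD away from their targets.
    """
    mu = fmean(pixels)
    sd = pstdev(pixels)  # population SD (an assumption)
    if sd == 0:
        raise ValueError("image has no contrast; cannot rescale")
    return [TARGET_MEAN + TARGET_SD * (p - mu) / sd for p in pixels]


# Made-up pixel values, for illustration only.
example = normalize_gray([10, 20, 30, 40])
```

Matching the first two moments of the gray-value distribution equates overall luminance and contrast across images, so that novel and familiar scenes do not differ systematically in these low-level properties.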



Training Sessions 

Each subject performed two training sessions prior to
the experiment. Similar to the actual experiment, both
training sessions began with a familiarization phase, during
which only 10 images were presented twice in random
order (duration = 1.5 s; ISI = 3 s) and subjects indicated
their indoor/outdoor status. As was the case for the main
experiment, familiarization was followed by a memory
based preference judgment task including familiar and
novel images. For training purposes, in training session 1
feedback was given on a trial-by-trial basis after each
response. In training session 2, reward feedback was not
shown immediately after each stimulus/response. Following
each training session, the subject's financial reward
(maximum £1) was reported to the subject. In Experiment
2, subjects also received a brief training session containing
10 familiar and 10 novel images per response-contingency
block.

Figure 1. Experimental design.



One day later, subjects performed an incidental recogni- 
tion memory test following the "remember/know" proce- 
dure [Tulving, 1985]. Here, in random order all 240 
previously seen pictures (60 per condition) were presented 
together with 60 new distractor pictures on the center of a 
computer screen. Task: The subject first made an "old/ 
new" decision to each individually presented picture using 
their right index or middle finger. Following a "new" deci- 
sion, subjects were prompted to indicate whether they 
were confident ("certainly new") or unsure ("guess"), 
again using their right index and middle finger. After an 
"old" decision, subjects were prompted to indicate if they 
were able to remember something specific about seeing 
the scene at study ("remember response"), just felt famili- 
arity with the picture without any recollective experience 
("familiar" response) or were merely guessing that the picture
was an old one ("guess" response). The subject had 4
s to make each of the two judgments, and there was a break
of 15 s after every 75 pictures.
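
The item counts of the memory test follow directly from the design and can be checked with a few lines of arithmetic. All numbers are taken from the text; the variable names are illustrative, and the break count assumes that no break followed the final picture.

```python
N_SETS = 3              # familiarization + preference task, repeated three times
FAMILIAR_PER_SET = 40   # images shown during each familiarization phase
NOVEL_PER_SET = 40      # 2 blocks x 20 previously unseen images
N_CONDITIONS = 4        # novel/familiar x rewarded/not-rewarded
N_DISTRACTORS = 60      # new pictures added on day two
BREAK_EVERY = 75        # a 15 s break after every 75 pictures

n_old = N_SETS * (FAMILIAR_PER_SET + NOVEL_PER_SET)  # previously seen pictures
n_per_condition = n_old // N_CONDITIONS
n_test = n_old + N_DISTRACTORS                        # total test items
# Breaks that fall strictly inside the test (assumption: none after the last item).
n_breaks = (n_test - 1) // BREAK_EVERY
```

This reproduces the 240 old pictures (60 per condition) stated above and gives 300 test items in total, with three breaks under the stated assumption.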



fMRI Methods 

We performed fMRI on a 3-Tesla Siemens Allegra magnetic
resonance scanner (Siemens, Erlangen, Germany)
with echo planar imaging (EPI) using a quadrature transceiver
coil with a design based on the "birdcage" principle.
In the functional session, 48 T2*-weighted images (EPI
sequence; covering the whole head) per volume with
blood oxygenation level-dependent (BOLD) contrast were
obtained (matrix size: 64 x 64; 48 oblique axial slices per
volume angled at -30° in the antero-posterior axis; spatial
resolution: 3 x 3 x 3 mm; TR = 3120 ms; TE = 30 ms;
z-shimming pre-pulse gradient moment of PP = 0 mT/m*ms;
positive phase-encoding polarity). The fMRI acquisition
protocol was optimized to reduce susceptibility-induced
BOLD sensitivity losses in inferior frontal regions
and temporal lobe regions [Deichmann, et al. 2003; Weiskopf,
et al. 2006]. For each subject, functional data were
acquired in three scanning sessions containing 180 volumes
per session. Six additional volumes per session were
acquired at the beginning of each series to allow for steady
state magnetization and were subsequently discarded
from further analysis. Anatomical images of each subject's
brain were collected using multi-echo 3D FLASH for mapping
proton density, T1 and magnetization transfer (MT)
at 1 mm resolution [Helms, et al. 2009; Weiskopf and
Helms, 2008] and by T1-weighted inversion recovery prepared
EPI (IR-EPI) sequences (matrix size: 64 x 64; 64 slices;
spatial resolution: 3 x 3 x 3 mm). Additionally,
individual field maps were recorded using a double echo
FLASH sequence (matrix size = 64 x 64; 64 slices; spatial
resolution = 3 x 3 x 3 mm; gap = 1 mm; short TE = 10
ms; long TE = 12.46 ms; TR = 1020 ms) for distortion correction
of the acquired EPI images [Weiskopf, et al. 2006].
Using the "FieldMap toolbox" [Hutton, et al. 2002, 2004],
field maps were estimated from the phase difference
between the images acquired at the short and long TE.
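
The acquisition timing implied by these parameters can be worked out as follows. TR, slice count, and volume counts are from the text; the per-slice time assumes slices are spread evenly across the TR with no gap, which the source does not state explicitly.

```python
TR_MS = 3120                # repetition time per volume (from the text)
SLICES_PER_VOLUME = 48      # from the text
VOLUMES_PER_SESSION = 180   # analyzed volumes per session (from the text)
DUMMY_VOLUMES = 6           # discarded volumes per session (from the text)
N_SESSIONS = 3              # from the text

# Assumption: slices are acquired back to back and fill the whole TR.
ms_per_slice = TR_MS / SLICES_PER_VOLUME

# Duration of the analyzed part of one session, and of the discarded part.
session_seconds = VOLUMES_PER_SESSION * TR_MS / 1000
dummy_seconds = DUMMY_VOLUMES * TR_MS / 1000

# Total number of volumes acquired per subject, including discarded ones.
total_volumes = N_SESSIONS * (VOLUMES_PER_SESSION + DUMMY_VOLUMES)
```

Under these assumptions each slice takes 65 ms, the analyzed portion of a session lasts about 562 s, and the discarded volumes cover roughly the first 19 s of each series.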

The fMRI data were preprocessed and statistically ana- 
lyzed using the SPM5 software package (Wellcome Trust 
Centre for Neuroimaging, University College London, UK) 
and MATLAB 7 (The MathWorks, Inc., Natick, MA). All 
functional images were corrected for motion artifacts by 
realignment to the first volume; corrected for distortions 
based on the field map [Hutton, et al. 2002]; corrected for 
the interaction of motion and distortion using the 
"Unwarp toolbox" [Andersson, et al. 2001; Hutton, et al. 
2004]; spatially normalized to a standard T1-weighted
SPM-template [Ashburner and Friston, 1999] (care was
taken that in particular midbrain regions aligned with the
standard-template); re-sampled to 2 x 2 x 2 mm; and
smoothed with an isotropic 4 mm full-width half-maximum
Gaussian kernel. Such fine-scale spatial resolution in
combination with a relatively small smoothing kernel is 
the basis for being able to detect small clusters of activa- 
tion, for instance within the midbrain and MTL regions 
where differential activation patterns (i.e., novelty 
responses and interactions between novelty and reward) 
might be located in close proximity [Bunzeck, et al. 2010]. 
The fMRI time series data were high-pass filtered (cutoff 
= 128 s) and whitened using an AR(1)-model. For each
subject an event-related statistical model was computed by 
creating a "stick function" for each event onset (duration 
= 0 s), which was convolved with the canonical hemody- 
namic response function combined with time and disper- 
sion derivatives [Friston, et al. 1998]. Modeled conditions 
included novel-rewarded, novel-not-rewarded, familiar- 
rewarded, familiar-not-rewarded and incorrect responses. 
To capture residual movement-related artifacts six covari- 
ates were included (the three rigid-body translation and 
three rotations resulting from realignment) as regressors of 
no interest. Regionally specific condition effects were 
tested by employing linear contrasts for each subject and 
each condition (first-level analysis). The resulting contrast 
images were entered into a second-level random-effects 
analysis. Here, the hemodynamic effects of each condition 
were assessed using a 2 x 2 analysis of variance
(ANOVA) with the factors "reward" (rewarding, not 
rewarding) and "novelty" (novel, familiar). This model 
allowed us to test for main effects of novelty, main effects 
of reward and the interaction between both. All contrasts 
were thresholded at P = 0.001 (uncorrected) except the 
regression analyses (P = 0.005, uncorrected). Both rela- 
tively liberal thresholds were chosen based on our precise 
a priori anatomical hypotheses within the mesolimbic 
system. 
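
The three effects tested in the 2 x 2 design correspond to standard contrast weight vectors over the four correct-response conditions. The sketch below is illustrative only: the condition order, the sign convention, and the beta values are assumptions, and it is not the SPM5 batch used in the study.

```python
# Assumed condition order (names from the modeled conditions in the text).
CONDITIONS = (
    "novel-rewarded",
    "novel-not-rewarded",
    "familiar-rewarded",
    "familiar-not-rewarded",
)

# Standard 2 x 2 contrast weights; the sign convention is an assumption.
CONTRASTS = {
    "novelty": (1, 1, -1, -1),      # novel > familiar
    "reward": (1, -1, 1, -1),       # rewarded > not rewarded
    "interaction": (1, -1, -1, 1),  # novelty x reward
}


def contrast_value(weights, betas):
    """Weighted sum of condition estimates for one contrast."""
    return sum(w * b for w, b in zip(weights, betas))


# Made-up betas: high responses to both stimulus types in the context where
# novelty is rewarded (novel-rewarded and familiar-not-rewarded).
example_betas = (2.0, 1.0, 1.0, 2.0)
effects = {name: contrast_value(w, example_betas) for name, w in CONTRASTS.items()}
```

With these made-up values both main effects are zero and only the interaction is positive. An F-contrast on the interaction is two-sided, so the direction of the effect has to be established separately, for example with a directed T-contrast or post hoc comparisons.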

The anatomical localization of significant activations was
assessed with reference to the standard stereotaxic atlas by
superimposition of the SPM maps on one of two group
templates. A T1-weighted and an MT-weighted group template
were derived from averaging all subjects' normalized
T1 or MT images (spatial resolution of 1 x 1 x 1 mm).
The T1-template allows anatomical localization outside
the midbrain, whereas on MT-images the SN/VTA region can
be distinguished from surrounding structures as a bright
stripe while the adjacent red nucleus and cerebral
peduncle appear dark [Bunzeck and Duzel, 2006; Bunzeck,
et al. 2007; Eckert, et al. 2004].

Note that we prefer to use the term SN/VTA and con- 
sider BOLD activity from the entire SN/VTA complex for 
several reasons [Duzel, et al. 2009]. Unlike early formula- 
tions of the VTA as an anatomical entity, different dopa- 
minergic projection pathways are dispersed and 
overlapping within the SN/VTA complex. In particular, 
dopamine neurons that project to the limbic regions and 
regulate reward-motivated behavior are not confined to 
the VTA but they are distributed also across the SN (pars 
compacta) [Gasbarri, et al. 1994, 1997; Ikemoto, 2007; Smith 
and Kieval, 2000]. Functionally, this is paralleled in the
fact that in humans and primates DA neurons within the
SN and VTA respond to both reward and novelty [see for
instance Ljungberg, et al., 1992 or Tobler, et al., 2003 for a
depiction of recording sites].

TABLE I. Behavioral results

           Experiment 1                            Experiment 2
           Rewarding-hits  Not rewarding-CorRej    Rewarding-hits  Not rewarding-CorRej
Novel      0.88 (0.08)     0.91 (0.1)              0.81 (0.09)     0.84 (0.1)
Familiar   0.9 (0.09)      0.87 (0.09)             0.86 (0.06)     0.85 (0.09)

Table shows the hit-rate or correct rejection rate per condition (as indicated in the column header) for Experiment 1 and Experiment 2. Numbers in
brackets indicate one standard deviation of the mean.



RESULTS 

All analyses (behavioral and fMRI) are based on trials 
with correct preference responses. 

Experiment 1

Subjects discriminated between conditions in both con- 
texts with high accuracy (Table I) and there were no statis- 
tically significant differences between conditions. Reaction 
time (Fig. 2A) analysis revealed that subjects responded 
fastest to familiar reward predicting stimuli (all P's < 
0.007), but there was no difference between the other three 
conditions (novel-rewarded, novel-not-rewarded, familiar- 
not-rewarded; all P's > 0.05). 

Recognition memory performance: second day. Recognition
memory analysis was based on both hits (remember
responses, know responses following pictures previously
seen during encoding), and false alarms ([FA]: remember,
know to distractors). In a first step, we calculated the
proportion of remember- and know-responses for old and
new images (i.e., hit-rates and FA-rates) by dividing the
number of hits (and FA, respectively) by the number of
items per condition. Secondly, corrected hit-rates were
obtained for remember-responses ([Rcorr], remember
hit-rate minus remember FA-rate) and know-responses
([Kcorr], know hit-rate minus know FA-rate; listed as
Fcorr in Table II). In a planned comparison, we assessed
the effect of reward on overall recognition memory
(corrected hit-rate = Rcorr + Kcorr) for novel and familiar
images. This revealed that reward significantly improved
overall memory for novel images compared to novel not
rewarded images (P = 0.036) but there was no such
improvement of overall memory by reward for familiar
images (P > 0.5; Fig. 2). Furthermore, the enhancing effect
of reward on recognition memory for novel images was
equally strong for recollection and familiarity as revealed
by analysis of variance (ANOVA; no interaction between
reward and recognition memory type [F(1,16) = 2.28, P > 0.15]).
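
The two-step scoring described above (rates first, then false-alarm correction) can be sketched as follows. The formulas are those given in the text; the function name and the example counts are hypothetical and are not data from the study.

```python
def corrected_rates(remember_hits, know_hits, n_old,
                    remember_fas, know_fas, n_new):
    """Return (Rcorr, Kcorr, overall) for one condition.

    Step 1: hit-rates and FA-rates are counts divided by item numbers.
    Step 2: each hit-rate is corrected by the matching FA-rate.
    """
    r_corr = remember_hits / n_old - remember_fas / n_new
    k_corr = know_hits / n_old - know_fas / n_new
    return r_corr, k_corr, r_corr + k_corr


# Made-up counts: 60 old items in the condition and 60 distractors.
r_corr, k_corr, overall = corrected_rates(
    remember_hits=12, know_hits=15, n_old=60,
    remember_fas=3, know_fas=6, n_new=60,
)
```

With these invented counts both corrected rates come out at about 0.15 and the overall corrected hit-rate at about 0.30. Because distractors cannot be assigned to a reward or novelty condition, the same FA-rates presumably serve as the correction for every condition, although the text does not say so explicitly.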



Experiment 2 

As in Experiment 1, subjects discriminated between conditions
in both contexts with high accuracy and no significant
differences between conditions (Table I). As in
Experiment 1, reaction-time (Fig. 2A) analysis showed
responses were significantly faster for familiar reward
predicting stimuli (all P's < 0.001) but there was no difference
between the other three conditions (novel-rewarded,
novel-not-rewarded, familiar-not-rewarded; all P's > 0.05).

Figure 2. Behavioral results. (A) Reaction-times. In both experiments RTs
were significantly faster for familiar rewarded images compared
to all other conditions (all P < 0.01), as indicated by the
asterisk, but there was no other difference between conditions. (B)
Recognition memory performance in Experiments 1 and 2. The
bars show overall recognition memory scores (corrected
hit-rate = correct remember plus correct know responses) in the
memory test on the next day. Error-bars denote one standard
error of the mean and asterisk indicates a statistically significant
difference (P < 0.05).

TABLE II. Recognition memory

Condition               Measure                 Experiment 1     Experiment 2
Novel rewarded          Recollection (Rcorr)    0.092 (0.016)    0.070 (0.021)
                        Familiarity (Fcorr)     0.131 (0.019)    0.088 (0.016)
Novel not-rewarded      Recollection (Rcorr)    0.071 (0.019)    0.063 (0.019)
                        Familiarity (Fcorr)     0.116 (0.023)    0.067 (0.018)
Familiar rewarded       Recollection (Rcorr)    0.588 (0.044)    0.396 (0.049)
                        Familiarity (Fcorr)     0.127 (0.041)    0.167 (0.036)
Familiar not-rewarded   Recollection (Rcorr)    0.567 (0.039)    0.322 (0.039)
                        Familiarity (Fcorr)     0.159 (0.041)    0.183 (0.031)

Table shows corrected recollection rate (Rcorr) and corrected familiarity rate (Fcorr) for all conditions and both experiments. Numbers
in brackets indicate one standard error of the mean.

Recognition memory performance: second day. In contrast
to Experiment 1, recognition memory for novel
rewarded images was not significantly improved compared
to novel unrewarded images (neither overall recognition
memory nor Rcorr/Kcorr; P > 0.05, Table II). Also
in contrast to Experiment 1, in Experiment 2 recollection
for familiar rewarded images was significantly enhanced
compared to familiar not-rewarded images (P = 0.001,
Table II), which resulted in enhanced overall memory (Rcorr
+ Kcorr) for familiar rewarded compared to familiar
not-rewarded images (there was no significant difference
between the corrected know-rates of familiar rewarded
and familiar not-rewarded images, P > 0.05). Furthermore,
data in Table II and Figure 2B show that overall memory
performance was considerably lower in Experiment 2 compared
to Experiment 1, which was supported by a mixed
effects ANOVA.

fMRI results: reward based recognition memory test.
First, we analyzed fMRI data using a 2 x 2 ANOVA with
factors "novelty" (novel, familiar) and "reward" (reward,
no reward). We found a main effect of novelty in bilateral
medial orbitofrontal cortex (mOFC) and the right MTL
including the hippocampus and rhinal cortex (Fig. 3; see
Supporting Information Table SI for a complete list of activated
brain structures). A main effect of reward was
observed within the bilateral caudate, septum/fornix, ventral
striatum (ncl. accumbens), bilateral mOFC and medial
prefrontal cortex (mPFC) (Fig. 4; Supporting Information
Table SI). These two main effects were exclusively masked
with the effects of interactions (exclusive masking, P =
0.05, uncorrected) to identify only those regions that
expressed main effects in the absence of any interaction.

Figure 3. fMRI results Experiment 2. A main effect of novelty was
observed within the right hippocampus (A), rhinal cortex (B)
and medial OFC (C). Activation maps were superimposed on a
T1-weighted group template (see methods), coordinates are
given in MNI space and color bar indicates T-values (results
thresholded at P = 0.001, uncorrected). Error-bars denote one
standard error of the mean and asterisk indicates a statistically
significant difference (P < 0.05). [Color figure can be viewed in
the online issue, which is available at wileyonlinelibrary.com.]

Figure 4. fMRI results Experiment 2. A main effect of reward was
observed within the striatum, including ncl. accumbens (A) and
caudate ncl. (C), septum/fornix (B), medial PFC (C), and medial
OFC (D). Activation maps were superimposed on a T1-weighted
group template (see methods), coordinates are given
in MNI space and color bar indicates T-values (results thresholded
at P = 0.001, uncorrected). Error-bars denote one standard
error of the mean and asterisk indicates a statistically
significant difference (P < 0.05). [Color figure can be viewed in
the online issue, which is available at wileyonlinelibrary.com.]

Figure 5. fMRI results Experiment 2. An interaction between novelty and
reward was observed within the hippocampus and OFC. Within
the hippocampus, responses to familiar not-rewarded items were
enhanced compared to familiar-rewarded items if presented in
context with novel-rewarding items. Activation maps were
superimposed on a T1-weighted group template (see methods),
coordinates are given in MNI space and color bar indicates
F-values (results thresholded at P = 0.001, uncorrected). Error-bars
denote one standard error of the mean and asterisk indicates
a statistically significant difference (P < 0.05). [Color figure
can be viewed in the online issue, which is available at
wileyonlinelibrary.com.]

To test our two predictions regarding the exploration 
bonus hypothesis, we performed two additional analyses. 
First, within brain regions that showed a main effect of
reward, we analyzed which areas also showed a stronger
response for novel rewarded than familiar rewarded stimuli
(i.e., conjunction). This analysis did not yield any significant
results, suggesting that there were no brain regions
where being novel led to a stronger reward prediction
response than being familiar. Secondly, we assessed the
interaction (F-contrast) between novelty and reward. Such 
an interaction was expressed within several brain regions 
including right hippocampus, inferior frontal gyrus and 
right OFC (Supporting Information Table SI, Fig. 5). Spe- 
cifically, the hippocampus showed the expected interaction 
pattern with higher responses for stimuli presented in the 
context where being novel is rewarded (T-contrast). That 
is, hippocampal activity was higher for novel rewarded 
stimuli and familiar unrewarded stimuli (note that both of 
these stimuli were presented in the same context) than for 
novel unrewarded and familiar rewarded stimuli (again, 
note that both of these stimuli were presented in the same 
context). Planned post hoc comparison confirmed statisti- 
cally significant differences between novel-rewarded vs. 
novel not-rewarded (P < 0.025) and familiar rewarded vs. 
familiar not-rewarded (P < 0.01; Fig. 5). 



It should be noted that the activation pattern for the
interaction between novelty and reward (36, -14, -16; Fig.
5) is adjacent but not identical to the activation of a main
effect of novelty, which is also located within the right
hippocampus (28, -14, -20; Fig. 3). Such a differential activation
pattern accords with our hypotheses, cell recordings
in animals and human fMRI studies. For instance, animal
research has shown that different hippocampal neurons 
can respond to different features (such as novelty or famil- 
iarity) within the same task [Brown and Xiang, 1998]. In 
line with these observations, we have shown in humans
that spatially distinct hippocampal activations can reflect
different properties of novelty processing: absolute novelty
signals, adaptively scaled novelty signals, and novelty
prediction errors ([Bunzeck, et al. 2010]; Supporting
Information Fig. S4). Johnson et al. (2008) reported that
spatially very close clusters of activation showed very
different responses to novelty: one cluster showed a cate- 
gorical difference between new items and old items 
whereas the other cluster showed a linear response decre- 
ment as a function of increased stimulus familiarity. How- 
ever, to further exclude the possibility of a false positive 
result we applied small volume correction to both activa- 
tion patterns using the right anterior hippocampus as vol- 
ume. The analysis reached statistical significance (P < 0.05;
FWE-corrected).

Finally, we sought to link reward related memory 
improvement to regional brain activity patterns using 
regression analyses (all analyses were performed with 
data from Experiment 2). First, the contrast novel 
rewarded vs. novel not-rewarded images was entered into 
a second-level simple regression analysis using individual 
memory improvement by reward as regressor (Δ corrected
hit-rate = corrected hit-rate [Rcorr + Fcorr] for novel
rewarded minus corrected hit-rate for novel not-rewarded).
This analysis was motivated by our initial observation of 
improved overall memory (i.e., recollection and familiar- 
ity) for novel images by reward (Experiment 1) and previ- 
ous similar findings [Adcock, et al. 2006; Krebs, et al. 2009; 
Wittmann, et al. 2005]. This revealed a significant positive 
correlation between hemodynamic responses (HR) and 
recognition memory improvement within the SN/VTA, 
right anterior MTL (junction of rhinal cortex/hippocampus/amygdala)
and right ventral striatum (Fig. 6; Supporting
Information Table SI for all activated regions). In a
second regression analysis, the same contrast for familiar
images (familiar rewarded vs. familiar not-rewarded) was
correlated with individual improvement in recollection rate
(behaviorally, recollection rate was significantly enhanced for
familiar rewarded compared to not-rewarded images but
there was no improvement in Fcorr). Since RTs for familiar
rewarded images were significantly faster than for familiar
not-rewarded images, the difference between the two for each
subject was also entered as a regressor. Here, we were only
interested in those regions that showed a significant posi- 
tive correlation between HR differences (familiar rewarded 
vs. familiar not rewarded) and increased recollection rate 
(familiar rewarded vs. familiar not-rewarded) but not 
those that also showed any correlation with RT improve- 
ment. This analysis revealed similar effects to the first 
regression analysis, namely, a significant correlation 
between HR and reward-related recollection-rate improve- 
ment within the ventral striatum (left), right hippocampus 
and left rhinal cortex (Fig. 7, Supporting Information Table 
SI), but no correlation within the SN/VTA. A statistically 
more sensitive post hoc analysis of the SN/VTA voxel [4,
-18, -16] that showed a significant correlation for novel
images also revealed no correlation between hemodynamic 
responses and improved recollection rate for familiar 
images (r = -0.07, P = 0.811). 
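
The logic of these regression analyses can be sketched as follows. This is a minimal illustration in Python, not the second-level SPM model used here: the function names and the example numbers are hypothetical, and a plain Pearson correlation between the behavioral difference score and a per-subject contrast estimate stands in for the voxel-wise simple regression.

```python
from math import sqrt


def delta_corrected_hit_rate(rcorr_rew, fcorr_rew, rcorr_unrew, fcorr_unrew):
    """Corrected hit-rate (Rcorr + Fcorr) for rewarded minus not-rewarded items."""
    return (rcorr_rew + fcorr_rew) - (rcorr_unrew + fcorr_unrew)


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up behavioral scores per subject:
    # (Rcorr rewarded, Fcorr rewarded, Rcorr not-rewarded, Fcorr not-rewarded)
    behavior = [(0.30, 0.20, 0.20, 0.10),
                (0.25, 0.15, 0.25, 0.10),
                (0.40, 0.20, 0.20, 0.15)]
    deltas = [delta_corrected_hit_rate(*subject) for subject in behavior]
    # Made-up contrast estimates (rewarded minus not-rewarded) at one voxel.
    contrast_estimates = [0.8, 0.1, 1.2]
    print(pearson_r(deltas, contrast_estimates))
```

In the analysis reported above the same relationship is assessed at every voxel, with the behavioral difference score entered as the regressor.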



DISCUSSION 

Our finding that a cluster of voxels within the MTL 
(including hippocampus and rhinal cortex) showed a main 
effect of novelty but not a main effect of reward (Fig. 
3A,B) supports the idea that the hippocampus and rhinal
cortex can signal novelty independent of reward-value. 
This finding accords with a wide range of animal and 
human studies suggesting that both the hippocampus and 
rhinal cortex are sensitive to novelty [Brown and Xiang, 
1998; Dolan and Fletcher, 1997; Knight, 1996; Lisman and 
Grace, 2005; Strange, et al. 1999; Yamaguchi, et al. 2004]. 
However, another region within the hippocampus also 
showed the hypothesized interaction of novelty and 
reward (Fig. 5) with significantly enhanced hemodynamic 
responses to familiar unrewarded images if presented in a 
context where being novel was rewarded. 



This interaction of novelty and reward in the hippocam- 
pus provides evidence for our second prediction of a con- 
textual effect in accordance with the exploration bonus 
framework (see [Sutton and Barto, 1981] for a formal 
description of the exploration bonus within the explora- 
tion-exploitation dilemma). Based on the notion that nov- 
elty can act as an exploration bonus for reward [Kakade 
and Dayan, 2002] we predicted that in a context in which 
being novel is rewarded there should be enhanced explo- 
ration also of the familiar stimuli (even when they are 
unrewarded). Compatible with this possibility, familiar 
stimuli elicited stronger hippocampal activity in a context 
where the availability of reward was signaled by being 
novel as compared to a context where reward is signaled 
by being familiar. This contextually enhanced neural acti- 
vation within the hippocampus during encoding, however, 
did not directly translate into long-term memory, that is, 
better memory for familiar items when presented in con- 
text with novel reward predicting items. Instead, recogni- 
tion performance was driven by the reward predicting 
status of an item both for novel (Experiment 1) and famil- 
iar (Experiment 2) stimuli (see below). This suggests that, 
in an experimental setting in which reward prediction and 
contextual novelty may both influence learning, reward 
prediction can exert the dominant influence.
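
The basic notion of an exploration bonus can be illustrated with a toy example. The sketch below, in Python, uses a simple count-based bonus that shrinks with repeated exposure; this is a deliberately simplified stand-in and neither the formulation of Kakade and Dayan [2002] nor a model fitted to our data. The function names, the `scale` parameter and the 1 / (1 + exposures) decay are assumptions made only for illustration, and the sketch does not capture the contextual effect on familiar stimuli discussed above.

```python
def novelty_bonus(exposures, scale=1.0):
    """Hypothetical bonus that is largest for never-seen stimuli and decays with exposure."""
    if exposures < 0:
        raise ValueError("exposures must be non-negative")
    return scale / (1 + exposures)


def effective_value(expected_reward, exposures, scale=1.0):
    """Expected reward plus the novelty bonus (assumed to combine additively)."""
    return expected_reward + novelty_bonus(exposures, scale)


if __name__ == "__main__":
    # A novel and a familiar stimulus that both predict no reward.
    print(effective_value(0.0, exposures=0))
    print(effective_value(0.0, exposures=3))
```

Under these assumptions an unrewarded novel stimulus still carries a higher effective value than an unrewarded familiar one, which is what motivates exploration of novel options.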

Another prediction regarding the exploration bonus 
framework was not confirmed. We did not find any brain 
regions which exhibited a main effect of reward and at 
the same time a significantly stronger activity for novel 
rewarded than familiar rewarded images. At first
glance, this negative finding seems to be at odds with
previous studies [Krebs, et al. 2009; Wittmann, et al.
2008]. However, in both the Krebs et al. [2009] and the
Wittmann et al. [2008] studies, enhanced reward prediction
for novel stimuli was found under conditions where
the novelty status of stimuli was implicit and partici- 
pants attended to reward contingencies. In fact, Krebs 
et al. reported that this enhancement was absent when 
participants attended to the novelty status of stimuli 
rather than attending to reward contingencies (note, however,
that in Krebs et al. novelty status per se was not
predictive of reward). Hence, unlike the contextual interaction
between novelty and reward (Fig. 5), this aspect
of the exploration bonus may be strongly task-dependent,
occurring only when subjects can attend to reward contingencies
without having to assess novelty. It has been
suggested on the basis of rodent studies that prefrontal 
and hippocampal inputs compete with each other for 
control over the nucleus accumbens (a part of the ventral 
striatum) [Goto and Grace, 2008]. It is plausible that 
task-related attention to novelty or reward would affect 
such a competition. 

Recognition memory scores from Experiment 1 (Fig. 2)
were well compatible with the exploration bonus framework
in showing a reward-related behavioral enhancement
of long-term memory performance for novel but not
for familiar stimuli. However, the behavioral results
obtained under conditions where encoding occurred in the
fMRI scanner (Experiment 2) were different in that memory
for familiar stimuli did show an enhancement by
reward (for novel stimuli this enhancement did not reach
significance). One reason for this discrepancy may be that
in Experiment 1, the encoding context and the retrieval
context on the next day were identical (subjects learned
and were tested in the same room) whereas for Experiment
2 they were different (subjects encoded in the fMRI
scanner and were tested in a testing room). It is well-known that
changes between encoding and retrieval context can have
profound influences on memory performance [Godden
and Baddeley, 1975]. Compatible with this possibility,
memory performance was considerably lower in Experiment
2 than in Experiment 1 (Fig. 2). Such context effects
may have also led to the discrepancy in the behavioral
patterns observed in Experiments 1 and 2.

Figure 6.

fMRI results Experiment 2: regression analysis. A significant
correlation between recognition memory improvement for novel
rewarded compared to not-rewarded images (Δ corrected
hit-rate) and hemodynamic response differences between novel
rewarded and novel not-rewarded images (parameter estimates,
beta) was exhibited in bilateral medial SN/VTA (A), right MTL
(B), and ventral striatum (C). Activation maps were superimposed
on an MT (A) and T1-weighted (B, C) group template (see
methods), coordinates are given in MNI space and color bar
indicates T-values (results thresholded at P = 0.005,
uncorrected). [Color figure can be viewed in the online issue, which
is available at wileyonlinelibrary.com.]

Figure 7.

fMRI results Experiment 2: regression analysis. A significant
correlation between recollection rate improvement for familiar
rewarded compared to familiar not-rewarded images (Δ recollection
rate) and hemodynamic response differences between familiar
rewarded and familiar not-rewarded images (parameter
estimates, beta) was observed in MTL including right hippocampus
(A) and left rhinal cortex (B), and left ventral striatum (C).
Activation maps were superimposed on a T1-weighted group
template (see methods), coordinates are given in MNI space and
color bar indicates T-values (results thresholded at P = 0.005,
uncorrected). [Color figure can be viewed in the online issue,
which is available at wileyonlinelibrary.com.]

Figure 8.

Schematic illustration of the functional relationship between
hippocampus, Nucleus accumbens (NAcc), medial prefrontal cortex
(mPFC) and substantia nigra/ventral tegmental area (SN/VTA).
To provide support for this model, we calculated a correlation
between the activation of our regions of interest, using a
Spearman correlation analysis for each subject on the deconvolved
time series, which results in a group correlation coefficient R
and a P-value. It should be noted that the arrows indicate
assumed directionality on the basis of known projections rather
than quantitatively estimated causality. [Color figure can be
viewed in the online issue, which is available at
wileyonlinelibrary.com.]

The ventral striatum (Fig. 4A) and medial prefrontal 
cortex (Fig. 4C,D) expressed main effects of expected
reward value. In our task, reward prediction depended
upon explicit novelty discrimination, and thus it is apparent
that regions expressing expected reward value (ventral
striatum, septum/fornix) require access to
information about memory for the presented picture. A 
likely origin of such declarative memory information is 
the MTL. In fact, hippocampus and rhinal cortex, as part 
of the MTL, not only expressed the main effect of nov- 
elty, but they are also well-known to send efferents to the 
ventral striatum and the medial prefrontal cortex (note
that projections from rhinal cortex to the NAcc stem primarily
from the entorhinal cortex [Friedman, et al. 2002;
Selden, et al. 1998; Thierry, et al. 2000]). The precise
mechanisms and computational processes that
may be implicated in translating novelty into
reward responses, however, are unclear. This possibly involves the
medial prefrontal cortex (including orbital parts), which,
in line with previous studies [O'Doherty, et al. 2004; Ranganath
and Rainer, 2003], expressed both novelty and
reward related activation (Fig. 3C and 4C,D).

The functional implications of our results regarding the 
representation of novelty and reward responses in the 
hippocampus, SN/VTA, ventral striatum and medial PFC 
are summarized in Figure 8. To provide support for this 
model, we calculated a correlation between the activation 
of our regions of interest, using a Spearman correlation 
analysis for each subject on the deconvolved time series, 
to provide a group correlation coefficient R and a P- 
value. 
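
The per-subject correlation step can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the function names are hypothetical, ties receive average ranks, and the per-subject coefficients are combined by averaging Fisher z-transformed values, which is one common choice and an assumption of this sketch, since the pooling procedure is not specified above.

```python
from math import atanh, sqrt, tanh


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def group_mean_r(subject_rs):
    """Average per-subject coefficients via Fisher z (assumes every |r| < 1)."""
    z_values = [atanh(r) for r in subject_rs]
    return tanh(sum(z_values) / len(z_values))


if __name__ == "__main__":
    # Made-up time series for two regions in two subjects.
    subjects = [([0.1, 0.4, 0.3, 0.9, 0.5], [0.2, 0.3, 0.5, 0.7, 0.4]),
                ([0.6, 0.2, 0.8, 0.1, 0.5], [0.5, 0.4, 0.6, 0.3, 0.1])]
    per_subject = [spearman_r(a, b) for a, b in subjects]
    print(per_subject, group_mean_r(per_subject))
```

Because each subject contributes a long time series, even small coefficients such as those reported below can reach statistical significance.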

Since reward was contingent upon novelty and the sole 
region that represented both types of signals was the 
mPFC, this region is likely to be the source of novelty- 
based reward signaling (R = 0.09; P < 0.001). The hippo- 
campus, on the other hand, is most likely the source of the 
novelty signal for the mPFC (R = 0.11; P < 0.001). This is 
plausible given that there are direct projections from the 
hippocampus to mPFC [Ferino, et al. 1987; Rosene and 
Van Hoesen, 1977]. It is also plausible that the mPFC 
reward signal is then conveyed to the NAcc (R = 0.09; P 
< 0.001) and the SN/VTA (R = 0.03; P = 0.08). It should 
be noted that the SN/VTA signal only correlated with the 
novelty responsive mPFC (R = 0.03; P = 0.08) but not the 
reward responsive mPFC (R = 0.007; P > 0.6). This 
suggests that mOFC inputs to the SN/VTA might arise
more strongly from those mPFC regions associated with 
novelty processing rather than reward processing. Our ob- 
servation that the mPFC responds to novelty and corre- 
lates with the SN/VTA signal is also compatible with the 
suggestion [Lisman and Grace, 2005] that the PFC is a 
source of a novelty signal into dopaminergic circuitry. The 
role of the NAcc in novelty signaling, however, still 
remains unclear [Duzel, et al. 2009]. That is, although we 
did not observe novelty signals within the NAcc there was 
a strong correlation between signals in the NAcc and nov- 
elty responsive mOFC regions (R = 0.09; P < 0.001), NAcc 
and novelty responsive hippocampus regions (R = 0.15; P 
< 0.001), and the NAcc and SN/VTA (R = 0.19; P < 
0.001). Finally, it should be noted that the arrows in our 
model indicate assumed directionality on the basis of 
known projections rather than quantitatively estimated 
causality. 

Reward related improvement of recognition memory 
was correlated with ventral striatum, SN/VTA and MTL 
activation (Fig. 6). An important aspect of hippocampal 
learning and plasticity is a requirement for DA in the 
expression of the late phase LTP (long-term potentiation) 
but not early phase LTP [Frey and Morris, 1998; Frey, 
et al. 1990; Huang and Kandel 1995; Jay 2003; Morris 
2006]. This supports the view that DA is required for long-term
memory consolidation, a view also corroborated by recent
behavioral data in rodents [O'Carroll, et al. 2006]. Our
data are compatible with this view in showing a correla- 
tion between long-term memory improvement through 
reward one day after encoding and activation within puta- 
tive dopaminergic regions and hippocampus. In particular, 
we see a correlation for novel rewarded vs. not-rewarded 
items within SN/VTA, ventral striatum and hippocampus 
and a correlation for familiar rewarded vs. nonrewarded 
items within ventral striatum and hippocampus. Given 
that the ventral striatum is a primary output structure of 
the dopaminergic midbrain (SN/VTA) [Fields, et al. 2007],
our results suggest that the reward-related
enhancement of long-term memory through the
hippocampal-SN/VTA loop is not limited to novel stimuli but
also applies to familiar stimuli. In fact, it is likely that the
degree of familiarity among the class of familiar stimuli 
(during encoding) was quite variable and that those stim- 
uli whose encoding benefited most from reward were the 
least familiar (relatively most novel) ones. Therefore it is 
reasonable to assume that correlations for the novel and 
familiar stimulus classes were driven by the same 
mechanisms. 

We also observed a main effect of reward in the sep- 
tum/fornix (Fig. 4B), a region that is likely to harbor cho- 
linergic neurons which project to medial temporal 
structures. Interestingly, animal studies show that similar 
to DA neurons, cholinergic neurons (in the basal forebrain) 
respond to novelty and habituate when stimuli become fa- 
miliar [Wilson and Rolls, 1990b]. However, in tasks in 
which familiar stimuli predict reward, the activity of basal
forebrain neurons reflects reward prediction rather than
novelty status [Wilson and Rolls, 1990a]. Our findings 
(Fig. 4B) are compatible with the observation of Wilson 
and Rolls (1990a) although we cannot say to what extent 
these activations actually involve responses of cholinergic 
neurons. 

Taken together, we replicate recent observations that activity
of the ventral striatum, SN/VTA, hippocampus and
rhinal cortex correlates with reward-related memory
enhancement, compatible with the hippocampus-SN/VTA
loop. Importantly, our findings provide new key insights
into the functional properties of the components of this 
loop. In a task in which the novelty status of an item predicted
reward, the hippocampus preferentially expressed
the novelty status whereas ventral striatum activity 
reflected the reward value independently of novelty status. 
The medial PFC (including orbital parts) was likely to be 
the site where novelty and reward signals were integrated 
because it expressed both novelty and reward effects and 
is known to be connected with the hippocampus and ven- 
tral striatum. Finally, in line with the exploration bonus 
theory [Kakade and Dayan, 2002], novel reward-predicting
stimuli exerted contextually enhancing effects on familiar 
(not rewarding) items, which were expressed as enhanced 
neural responses within the hippocampus. 